Submodules

module

class, client_address, server)

Bases: socketserver.BaseRequestHandler

A simple server that loads the w2v model and handles expand requests from the UI.
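As a rough illustration of the pattern this class follows, here is a minimal socketserver.BaseRequestHandler subclass. The handler name ExpandHandler, the JSON wire format, and the expand_terms stub are assumptions made for this sketch; the real server loads the w2v model and may use a different protocol.

```python
import json
import socket
import socketserver


def expand_terms(seed):
    """Hypothetical stand-in for the model call: echoes the seed in upper case."""
    return [term.upper() for term in seed]


class ExpandHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: reads one JSON request and replies with JSON."""

    def handle(self):
        # self.request is the connected socket for this client.
        raw = self.request.recv(4096)
        seed = json.loads(raw.decode("utf-8"))["seed"]
        reply = json.dumps({"expanded": expand_terms(seed)})
        self.request.sendall(reply.encode("utf-8"))


def demo():
    """Drive the handler over a local socket pair, without opening a port."""
    server_side, client_side = socket.socketpair()
    try:
        client_side.sendall(json.dumps({"seed": ["nlp", "ai"]}).encode("utf-8"))
        # BaseRequestHandler.__init__ runs setup(), handle() and finish().
        ExpandHandler(server_side, ("local", 0), None)
        return json.loads(client_side.recv(4096).decode("utf-8"))
    finally:
        server_side.close()
        client_side.close()
```

To serve real clients, a handler like this would typically be passed to socketserver.TCPServer((host, port), ExpandHandler) and run with serve_forever().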

static annotate(text, seed)
handle()

module

Script that prepares the input corpus for np2vec training: it runs an NP extractor on the corpus and marks the extracted NPs.

, nlp_parser, chunker)

Given a span, determine its group and return the normalized text representing the group.

Parameters:spacy_span (spacy.tokens.Span)

, marked_corpus_file, nlp_parser, lines_count, chunker, mark_char='_', grouping=False)

, old_id, diff_id)

module

class, binary=False, word_ngrams=False, grouping=False, light_grouping=False, grouping_map_dir=None)

Bases: object

Set expansion module, given a trained np2vec model.

expand(seed, topn=500)

Given a seed of terms, return the expanded set of terms.

Parameters:
  • seed – seed terms
  • topn – maximum number of expanded terms to return

Returns:up to topn expanded terms and their probabilities
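The ranking idea behind expand can be sketched in plain Python. This is an illustration under stated assumptions, not the module's implementation: the toy vectors, the expand_sketch name, and the choice to score each candidate by cosine similarity to the mean seed vector are all hypothetical.

```python
import math

# Toy embedding table (hypothetical values); a trained np2vec model
# would supply the real vectors.
VECTORS = {
    "apple": [1.0, 0.0],
    "pear": [0.9, 0.1],
    "banana": [0.8, 0.3],
    "car": [0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_sketch(seed, topn=500):
    """Rank non-seed terms by similarity to the mean seed vector (assumed scoring)."""
    seed_vecs = [VECTORS[term] for term in seed]
    centroid = [sum(column) / len(seed_vecs) for column in zip(*seed_vecs)]
    scored = [
        (term, cosine(centroid, vec))
        for term, vec in VECTORS.items()
        if term not in seed  # seed terms are not returned as expansions
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]
```

With this toy table, expand_sketch(["apple"], topn=2) returns "pear" and "banana", in that order, each paired with its score.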


Return the vocabulary as the list of terms.

Returns:the list of terms.
seed2term_similarity(seed_id, term_id)

Compute the cosine similarity between a set of seed terms and a term.

Parameters:
  • seed_id – seed term ids
  • term_id – the term id

Returns:Similarity between the seed terms and the term
similarity(terms, seed, threshold)
term2id(term, suffix=True)

Given a term, return its id.

Parameters:term (str) – term (noun phrase)
Returns:its id (if it is part of the model)
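One way a lookup like term2id could work is sketched below. It assumes, based on the mark_char='_' default that appears in the corpus-marking signature on this page, that a noun phrase is stored with spaces replaced by the mark character and, when suffix is true, with the mark character appended. The vocabulary, the term2id_sketch name, and this id scheme are assumptions for illustration, not documented behavior.

```python
MARK_CHAR = "_"  # assumed marker, mirroring the mark_char='_' default

# Hypothetical model vocabulary of marked noun phrases.
VOCAB = {"machine_learning_", "neural_network_", "data_"}


def term2id_sketch(term, suffix=True):
    """Return the assumed vocabulary id for a term, or None if it is unknown."""
    term_id = term.replace(" ", MARK_CHAR)
    if suffix:
        term_id += MARK_CHAR
    return term_id if term_id in VOCAB else None
```

Returning None for an unknown term matches the "if it is part of the model" wording, though the actual return value for a missing term is not stated in this page.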
term2term_similarity(term_id_1, term_id_2)

Compute the cosine similarity between two term ids.

Parameters:
  • term_id_1 – first term id
  • term_id_2 – second term id

Returns:Similarity between the first and second term ids
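For reference, the standard definition of cosine similarity between two vectors is given below. The symbols u and v stand for the embedding vectors of the two terms and are introduced only for this illustration; how the vectors are looked up from term ids is not specified on this page.

```latex
\[
\operatorname{sim}(u, v)
  = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
  = \frac{\sum_{i} u_i v_i}{\sqrt{\sum_{i} u_i^{2}} \, \sqrt{\sum_{i} v_i^{2}}}
\]
```

The value lies in [-1, 1] and equals 1 when the two vectors point in the same direction.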

Module contents